ANALYSIS OF THE PLAYERS DATASET¶

Data analysis of the players, player valuations, and clubs datasets¶

The first cell only sets the folder path and imports the libraries used.

Value of the players correlated to their position¶

One point in the Attack column falls well outside the usual bounds and can be considered an outlier. All other players have market values in the range of 0-100 million euros. Overall, position on the field does not appear to influence market value strongly, with one exception: goalkeepers tend to have the lowest market values among all squad positions. Those who defend the goal are thus generally the lowest-valued players.

Analysis of the outlier in the Attack column¶

To identify the player behind the outlier in the graph above, we search the Attack position for the highest market value. The search returns Erling Haaland, who has the highest market value both among attackers and among all players in the dataset.

index                                  12249
player_id                              418560
first_name                             Erling
last_name                              Haaland
name                                   Erling Haaland
last_season                            2023
current_club_id                        281
player_code                            erling-haaland
country_of_birth                       England
city_of_birth                          Leeds
country_of_citizenship                 Norway
...                                    ...
foot                                   left
height_in_cm                           195.0
market_value_in_eur                    180000000.0
highest_market_value_in_eur            180000000.0
contract_expiration_date               2027-06-30 00:00:00
agent_name                             Rafaela Pimenta
image_url                              https://img.a.transfermarkt.technology/portrai...
url                                    https://www.transfermarkt.co.uk/erling-haaland...
current_club_domestic_competition_id   GB1
current_club_name                      Manchester City

1 rows × 23 columns

As the output shows, Erling Haaland has the highest market value in the Attack position and in the dataset as a whole. At 180 million euros, his value is far above that of the other players.
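The lookup described in this section can be sketched roughly as follows. This is a minimal illustration on made-up toy data, not the notebook's actual cell: the `position` column name and the placeholder player names are assumptions, while `market_value_in_eur` appears in the output above.

```python
import pandas as pd

# Toy stand-in for the players table (values are invented for illustration).
# The `position` column name is an assumption.
players = pd.DataFrame(
    {
        "name": ["Player A", "Player B", "Player C", "Player D"],
        "position": ["Attack", "Attack", "Midfield", "Goalkeeper"],
        "market_value_in_eur": [180e6, 90e6, 100e6, 40e6],
    }
)

# Keep only attackers, then select the row holding the largest market value.
attackers = players[players["position"] == "Attack"]
top_attacker = attackers.loc[[attackers["market_value_in_eur"].idxmax()]]

# Check whether that attacker is also the most valuable player overall.
is_overall_top = (
    top_attacker["market_value_in_eur"].iloc[0]
    == players["market_value_in_eur"].max()
)

print(top_attacker)
print("Also highest overall:", is_overall_top)
```

Passing a one-element list to `.loc` keeps the result as a one-row DataFrame, which matches the shape of the output shown above.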

Analysis of the amount of players in each position¶

This pie chart, built with the Plotly library, shows the distribution of players in the dataset by position. The most represented position is Defender, accounting for 32.9% of the players, followed by Midfielders with 29.4%, Attackers with 27.4%, and finally Goalkeepers with 10.2%. The data is the count of players grouped by position, displayed as a percentage of all players. Each segment of the pie chart is interactive and can be clicked to see the percentage of players in each position both including and excluding that position.
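The percentages behind such a chart can be reproduced with pandas alone. The sketch below uses invented toy data and assumes the position column is called `position`; it only computes the shares and leaves the plotting aside.

```python
import pandas as pd

# Toy data: 4 defenders, 3 midfielders, 2 attackers, 1 goalkeeper.
# The `position` column name is an assumption.
players = pd.DataFrame(
    {
        "position": ["Defender"] * 4
        + ["Midfield"] * 3
        + ["Attack"] * 2
        + ["Goalkeeper"]
    }
)

# Count players per position and express each count as a percentage.
position_share = (
    players["position"]
    .value_counts(normalize=True)  # fractions that sum to 1
    .mul(100)
    .round(1)
    .rename("share_percent")
)

print(position_share)
```

A frame like this, or the raw counts, is what a pie chart would then take as input.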

Grouping by team lineups¶

The analysis focuses on examining the distribution of players among various clubs. Initially, a field is created that combines the players’ first and last names to form the full name. Subsequently, the players are grouped by their current club, with a complete list of names for each club.

The result of the grouping is displayed in a table, clearly showing how many players belong to each club. For better visual understanding, an interactive bar chart is created that represents the number of players in each club. The chart allows users to see the number of players and, by hovering over each bar, view the players’ names. This analysis provides a clear overview of team compositions across different clubs.

        current_club_name                                          full_name
0    1.FC Heidenheim 1846  [Nikola Dovedan, Florian Pick, Tim Siersleben,...
1               1.FC Köln  [Sven Bacher, Kristian Pedersen, Jan Thielmann...
2          1.FC Nuremberg  [Enrico Valentini, Patrick Rakovsky, Hanno Bal...
3       1.FC Union Berlin  [Alexander Schwolow, Paul Seguin, Sheraldo Bec...
4          1.FSV Mainz 05  [Stephan Fürstner, Fabian Frei, Philipp Schulz...
..                    ...                                                ...
419      Yeni Malatyaspor  [Bugra Cagiran, Yakup Alkan, Yigithan Güveli, ...
420  Zenit St. Petersburg  [Anatoliy Tymoshchuk, Aleksandr Kerzhakov, Art...
421   Zirka Kropyvnytskyi  [Maksym Drachenko, Sergiy Kernozhytskyi, Oleks...
422         Zorya Lugansk  [Andriy Poltavtsev, Dmytro Myshnyov, Vladyslav...
423          Ümraniyespor  [Olarenwaju Kayode, Isaac Sackey, Yusuf Yardim...

[424 rows x 2 columns]
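A grouping of this kind could look roughly like the sketch below. It runs on invented toy data; the handling of a missing first name and the extra `player_count` column are assumptions added for illustration, while `first_name`, `last_name`, `current_club_name`, and `full_name` are names that appear in the outputs above.

```python
import pandas as pd

# Toy stand-in for the players table (names and clubs are invented).
players = pd.DataFrame(
    {
        "first_name": ["Ann", "Bea", "Cy", None],
        "last_name": ["One", "Two", "Three", "Solo"],
        "current_club_name": ["Club X", "Club Y", "Club X", "Club Y"],
    }
)

# Build the full name; a missing first name leaves just the last name
# (this handling is an assumption, not taken from the notebook).
players["full_name"] = (
    players["first_name"].fillna("") + " " + players["last_name"]
).str.strip()

# One row per club, holding the list of its players' full names.
lineups = (
    players.groupby("current_club_name")["full_name"]
    .agg(list)
    .reset_index()
)

# Number of players per club, e.g. as bar heights for a chart.
lineups["player_count"] = lineups["full_name"].apply(len)

print(lineups)
```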

Top 10 players with the highest market value in the world in the last 10 years¶

The analysis focuses on the evolution of football players’ market value between 2013 and 2023. Initially, full names of the players are created by combining their first and last names. Subsequently, the data is filtered to include only the seasons from 2013 to 2023.

To identify the most valuable players, the top 10 players with the highest overall market value during this period are selected. These players are extracted from the original dataframe to create a new subset of data. The players are then sorted by their highest market value, and a column associating their full names is added.

An animated bar chart is created to visualize the market value evolution of these top 10 players over time. The chart is horizontally oriented and allows viewers to see the market value trend for each season from 2013 to 2023. The chart theme is set with a white background for better readability.

This dynamic visualization provides a clear overview of how the market value of the top players has changed over the years, offering valuable insights for analysts, fans, and industry professionals.
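The filtering and selection steps above might be sketched as follows. The toy data is invented, and the choice to filter on `last_season` and rank by `highest_market_value_in_eur` is an assumption about which columns the notebook uses; both column names do appear in the players table.

```python
import pandas as pd

# Toy players table: 15 invented players with decreasing peak values.
players = pd.DataFrame(
    {
        "name": [f"Player {i}" for i in range(15)],
        # The first two players fall outside the 2013-2023 window.
        "last_season": [2010, 2012] + [2013 + (i % 11) for i in range(13)],
        "highest_market_value_in_eur": [v * 1e6 for v in range(15, 0, -1)],
    }
)

# Keep only players whose last season lies between 2013 and 2023 inclusive.
recent = players[players["last_season"].between(2013, 2023)]

# Select the ten rows with the largest peak market value, sorted descending.
top10 = recent.nlargest(10, "highest_market_value_in_eur")

print(top10[["name", "last_season", "highest_market_value_in_eur"]])
```

`nlargest` returns the rows already sorted from highest to lowest value, so no separate sort is needed before plotting.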

How the market value of the players has changed over the last 10 years¶

The analysis focuses on the evolution of football players’ market value between 2013 and 2023. Initially, a field is created that combines the players’ first and last names to form the full name. Subsequently, the data is filtered to include only the seasons from 2013 to 2023.

For each year, the top 10 players with the highest market value are identified. This data is then merged with the players’ details, creating a new dataset that contains information on the top 10 players for each year.
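One way to pick the top players within each year and attach their details is sketched below. The data is invented, `TOP_N` is set to 2 so the toy example stays small (the notebook uses 10), and the `year` column is assumed to have been derived from the valuation date beforehand.

```python
import pandas as pd

# Toy valuations: one value per player per year (invented numbers).
valuations = pd.DataFrame(
    {
        "player_id": [1, 2, 3, 1, 2, 3],
        "year": [2013, 2013, 2013, 2014, 2014, 2014],
        "market_value_in_eur": [50e6, 80e6, 20e6, 90e6, 60e6, 70e6],
    }
)
players = pd.DataFrame(
    {"player_id": [1, 2, 3], "name": ["Player A", "Player B", "Player C"]}
)

TOP_N = 2  # the notebook selects 10 per year

top_per_year = (
    valuations
    # Sort so the most valuable players come first within each year.
    .sort_values(["year", "market_value_in_eur"], ascending=[True, False])
    # Keep the first TOP_N rows of every year.
    .groupby("year")
    .head(TOP_N)
    # Attach player details; a left merge keeps the row order.
    .merge(players, on="player_id", how="left")
)

print(top_per_year)
```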

An animated box plot is created to visualize the market value evolution of these players over time. The plot shows market value on the Y-axis and the players’ names on the X-axis, with different colors for each player. The Y-axis range is set to include a margin beyond the maximum market value.

The plot is further enhanced with a theme that includes a white background and interactive controls to animate the visualization over time, allowing viewers to see the year-by-year evolution. The controls include buttons to play and pause the animation, as well as a slider to manually select the year.

This dynamic visualization provides a clear overview of how the market value of the top 10 players has changed from 2013 to 2023, offering valuable insights for analysts, fans, and industry professionals.

Map of the country of birth of the players in the dataset¶

This data analysis focuses on the country of origin of the players in the dataset. The data is grouped by country of birth and shows the number of players from each country.

To visually represent the results, an interactive choropleth map is created using Folium. The world boundaries dataset is loaded using geopandas, and the player counts for each country are calculated and mapped to the country boundaries.

The map is initialized with a zoom level that shows the entire world. A choropleth layer is then added, which colors countries based on the number of players born in that country. The coloring ranges from light yellow to dark red, with more players corresponding to darker colors. The legend, positioned in the right corner, explains the meaning of the colors. Additionally, the map is interactive, allowing users to click on countries to view the number of players.

The aim of this analysis is to determine which countries have the most players born there and how these players are distributed globally. The result is a clear visual representation of the distribution of football players by country of origin.
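The counting and mapping step can be illustrated without any geographic libraries. In the sketch below, `world` is a plain DataFrame standing in for the boundaries table, its `name` column is an assumption, and filling countries without players with 0 is also an assumption for this map; the data is invented.

```python
import pandas as pd

# Toy players table; one player has no recorded country of birth.
players = pd.DataFrame(
    {"country_of_birth": ["France", "France", "Brazil", None, "Norway"]}
)

# Stand-in for the world boundaries table (geometry column omitted).
world = pd.DataFrame({"name": ["France", "Brazil", "Norway", "Iceland"]})

# Count players per country; missing countries are dropped by value_counts.
player_counts = (
    players["country_of_birth"]
    .value_counts()
    .rename_axis("name")
    .reset_index(name="player_count")
)

# Attach the counts to every country on the map; unmatched countries get 0.
world_counts = world.merge(player_counts, on="name", how="left")
world_counts["player_count"] = world_counts["player_count"].fillna(0).astype(int)

print(world_counts)
```

In practice the country spellings in the two sources may differ, so some names may need to be harmonised before the merge.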

/var/folders/tc/n9fflz493t1ckr05x_kz0pm00000gn/T/ipykernel_77889/3017363912.py:1: FutureWarning:

The geopandas.dataset module is deprecated and will be removed in GeoPandas 1.0. You can get the original 'naturalearth_lowres' data from https://www.naturalearthdata.com/downloads/110m-cultural-vectors/.


Data analysis of how the market value of the most valuable player changed over time¶

The data analysis focuses on the market value of a football player over time. To start, player data is merged with market valuation data using a common identifier. This allows for the combination of player demographic information with their historical market values.

Next, the player with the highest market value within the dataset is identified. Once identified, the data is filtered to include only the market valuations of this specific player.
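The merge, identification, and filtering steps could be sketched like this. The tables are invented toy data, and the `date` and `market_value_in_eur` column names in the valuations table are assumptions; `player_id` is the common identifier.

```python
import pandas as pd

# Toy players and valuations tables (invented values).
players = pd.DataFrame(
    {"player_id": [1, 2], "name": ["Player A", "Player B"]}
)
valuations = pd.DataFrame(
    {
        "player_id": [1, 1, 2, 2],
        "date": pd.to_datetime(
            ["2019-01-01", "2021-01-01", "2019-06-01", "2022-06-01"]
        ),
        "market_value_in_eur": [30e6, 60e6, 45e6, 180e6],
    }
)

# Combine each valuation with the matching player's details.
merged = valuations.merge(players, on="player_id", how="inner")

# Find the player who owns the single highest valuation.
top_id = merged.loc[merged["market_value_in_eur"].idxmax(), "player_id"]

# Keep only that player's valuations, ordered in time for a line chart.
top_history = merged[merged["player_id"] == top_id].sort_values("date")

print(top_history)
```

Swapping `idxmax` for `idxmin` selects the player with the lowest valuation instead.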

To visualize the evolution of this player’s market value over time, an interactive line chart is created using Plotly. The chart trace shows the fluctuations in market value over time, with the X-axis representing the dates and the Y-axis representing the market value in euros. The chart layout is configured to include an informative title specifying the player’s name and to enhance readability with appropriate axis labels.

The resulting chart provides a clear visual representation of how the market value of the most valuable player has changed over time, offering valuable insights for analysts, fans, and industry professionals.

Data analysis of how the market value of the least valuable player changed over time¶

The data analysis focuses on the market value of a football player over time. To start, player data is merged with market valuation data using a common identifier. This allows for the combination of player demographic information with their historical market values.

Next, the player with the lowest market value within the dataset is identified. Once identified, the data is filtered to include only the market valuations of this specific player.

To visualize the evolution of this player’s market value over time, an interactive line chart is created using Plotly. The chart trace shows the fluctuations in market value over time, with the X-axis representing the dates and the Y-axis representing the market value in euros. The chart layout is configured to include an informative title specifying the player’s name and to enhance readability with appropriate axis labels.

The resulting chart provides a clear visual representation of how the market value of the player with the lowest value has changed over time, offering valuable insights for analysts, fans, and industry professionals.

Data analysis of the players with the most yellow cards, grouped by country of birth¶

This data analysis focuses on the distribution of yellow cards received by football players, grouped by country of birth. To begin, player data is merged with appearance data to obtain a complete dataset. Subsequently, the total number of yellow cards for each country of birth is calculated.

World boundaries are loaded using geopandas, and the yellow card data is merged with the country geometries. A value of 0 is set for countries without (NaN) yellow card data.
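The aggregation can be sketched with pandas only. The data below is invented, the `yellow_cards` column name is an assumption, and `map_countries` is a hypothetical list standing in for the countries in the boundaries table. Instead of merging and then replacing NaN, this sketch uses `reindex` with `fill_value=0`, which gives the same zeros for countries without data.

```python
import pandas as pd

# Toy players and appearances tables (invented values).
players = pd.DataFrame(
    {
        "player_id": [1, 2, 3],
        "country_of_birth": ["France", "Brazil", "France"],
    }
)
appearances = pd.DataFrame(
    {"player_id": [1, 1, 2, 3], "yellow_cards": [1, 0, 2, 1]}
)

# Hypothetical list of countries present on the map.
map_countries = ["France", "Brazil", "Iceland"]

# Attach each appearance to the player's country, then total the cards.
cards_by_country = (
    appearances.merge(players, on="player_id", how="left")
    .groupby("country_of_birth")["yellow_cards"]
    .sum()
)

# Align with the map's countries; those without any data receive 0.
cards_on_map = cards_by_country.reindex(map_countries, fill_value=0)

print(cards_on_map)
```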

An interactive choropleth map is created with Folium to visualize the results. The map displays countries colored based on the number of yellow cards received by players born in that country. The coloring ranges from light to dark, with more yellow cards corresponding to darker colors. The legend, positioned in the right corner, explains the meaning of the colors.

Finally, a title is added to the map to provide context. The resulting map offers a clear visual representation of the distribution of yellow cards among football players worldwide, providing valuable insights for analysts, fans, and industry professionals.


Data analysis of the mean age of players at their last appearances, grouped by country of birth¶

This data analysis focuses on the distribution of red cards received by football players, grouped by country of birth. Player data is merged with appearance data to obtain a complete dataset, which is then used to calculate the total number of red cards for each country of birth.

World boundaries are loaded using geopandas, and the red card data is merged with the country geometries. NaN values are set to 0 for countries without red card data.

An interactive choropleth map is created with Folium to visualize the results. The map displays countries colored based on the number of red cards received by players born in that country. The coloring ranges from yellow to green, with more red cards corresponding to darker colors. The legend, positioned in the right corner, explains the meaning of the colors.

Finally, a title is added to the map to provide context. The resulting map offers a clear visual representation of the distribution of red cards among football players worldwide, providing valuable insights for analysts, fans, and industry professionals.


Data analysis of the player with the highest market value in the world before the Covid pandemic¶

This analysis focuses on the evolution of the market value of the player with the highest market value during the pre-Covid period (2017-2020). Here is a detailed description of the analysis and visualization process:¶

1.	Merging DataFrames:

The player valuation data is merged with the player demographic data using the player_id identifier.

2.	Identifying the Player with the Highest Market Value:

The player with the highest market value in the dataset is identified.

3.	Filtering Pre-Covid Data:

The data is filtered to include only the player’s valuations during the pre-Covid period, from January 1, 2017, to January 1, 2020.

4.	Date Formatting:

The ‘datetime’ column is converted to datetime format, and a new ‘date’ column is created containing only the date without the time. The data is sorted by date.

5.	Creating the Animated Plot:

Using Plotly, a figure is created that includes line plots with markers representing the player’s market value evolution over time. Animations are added to show the day-by-day evolution of the value.

6.	Adding Interactive Controls:

Buttons are added to play and pause the animation, along with a slider to manually select dates. The chart layout is configured to include an informative title, axis labels, and controls for the animation.

This dynamic visualization provides a clear representation of how the market value of the most valuable player evolved in the three years leading up to the Covid-19 pandemic, offering valuable insights for analysts, fans, and industry professionals.
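Steps 3 and 4 can be sketched as below for a single player's valuation history. The data is invented and the `market_value_in_eur` column name is an assumption. Unlike the order listed above, this sketch converts the `datetime` column first, so that the filter compares timestamps rather than strings.

```python
import pandas as pd

# Toy valuation history for one player, with dates stored as strings.
history = pd.DataFrame(
    {
        "datetime": [
            "2016-12-15 00:00:00",
            "2017-06-01 00:00:00",
            "2019-12-19 00:00:00",
            "2020-04-08 00:00:00",
        ],
        "market_value_in_eur": [5e6, 20e6, 80e6, 72e6],
    }
)

# Convert to datetime and derive a date-only column.
history["datetime"] = pd.to_datetime(history["datetime"])
history["date"] = history["datetime"].dt.date

# Keep valuations from January 1, 2017 to January 1, 2020 (inclusive).
start, end = pd.Timestamp("2017-01-01"), pd.Timestamp("2020-01-01")
pre_covid = (
    history[history["datetime"].between(start, end)]
    .sort_values("date")
    .reset_index(drop=True)
)

print(pre_covid)
```

Changing `start` and `end` to 2020-01-01 and 2023-01-01 yields the window for the pandemic years.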

Data analysis of the player with the highest market value in the world during the Covid pandemic years¶

The analysis focuses on the evolution of the market value of the player with the highest market value during the COVID-19 pandemic period, from 2020 to 2023. To begin, the player valuation data is merged with the player demographic data using the player_id identifier to create a complete dataset. Next, the player with the highest market value in the dataset is identified. Once the player is identified, the data is filtered to include only the market valuations during the pandemic period, from January 1, 2020, to January 1, 2023.

The datetime column is then converted to datetime format, and a new date column is created containing only the date without the time, with the data ordered by date. To visualize the evolution of this player’s market value over time, an interactive figure is created using Plotly, which includes line plots with markers representing the day-by-day evolution of the market value. Animations are added to show how the value has evolved over time.

Interactive controls are also added, such as buttons to play and pause the animation and a slider to manually select dates. The chart layout is configured to include an informative title specifying the player’s name, axis labels, and controls for the animation. The resulting visualization offers a clear visual representation of how the market value of the most valuable player changed during the COVID-19 pandemic, providing valuable insights for analysts, fans, and industry professionals.

Data analysis comparing Kylian Mbappé and Erling Haaland¶

This section is a comparative analysis of two of the most famous football players of our time, Kylian Mbappé and Erling Haaland, who are also two of the highest-valued players in our dataset. The analysis focuses on the red and yellow cards received by both players throughout their careers, displayed using a horizontal bar chart with subplots.
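The card totals behind such a comparison might be computed as in the sketch below. It uses placeholder player names and invented numbers rather than real records, and the `player_name`, `yellow_cards`, and `red_cards` column names are assumptions.

```python
import pandas as pd

# Toy appearances table with placeholder players (invented values).
appearances = pd.DataFrame(
    {
        "player_name": [
            "Player A", "Player A", "Player B", "Player B", "Player C",
        ],
        "yellow_cards": [1, 0, 1, 1, 1],
        "red_cards": [0, 1, 0, 0, 0],
    }
)

# The two players to compare.
selected = ["Player A", "Player B"]

# Total yellow and red cards per selected player across all appearances.
card_totals = (
    appearances[appearances["player_name"].isin(selected)]
    .groupby("player_name")[["yellow_cards", "red_cards"]]
    .sum()
)

print(card_totals)
```

Each column of `card_totals` can then feed one subplot of a horizontal bar chart.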

Data analysis of the market value for players grouped by position¶

The following analysis aims to examine and compare various categories of football players, namely goalkeepers, attackers, defenders, and midfielders, with respect to their market value. This question arises from the desire to understand whether having a specific role influences a player’s market value. The analysis, represented through an interactive bar chart using the Plotly library, shows that attackers have the highest market value. By hovering the cursor over the chart, we can see the maximum value they reach. Following them are midfielders, defenders, and finally goalkeepers. Despite being the ones who truly defend the team against the opponent’s goal-scoring attempts, goalkeepers have a lower market value.
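A per-position summary of this kind could be computed as sketched below. The data is invented and the `position` column name is an assumption. The median is included next to the maximum only as an illustrative addition, since a single outlier can dominate the maximum.

```python
import pandas as pd

# Toy players table (invented values).
players = pd.DataFrame(
    {
        "position": [
            "Attack", "Attack", "Midfield", "Midfield",
            "Defender", "Defender", "Goalkeeper", "Goalkeeper",
        ],
        "market_value_in_eur": [
            180e6, 20e6, 100e6, 30e6, 75e6, 15e6, 40e6, 5e6,
        ],
    }
)

# Maximum and median market value per position, highest maximum first.
value_by_position = (
    players.groupby("position")["market_value_in_eur"]
    .agg(["max", "median"])
    .sort_values("max", ascending=False)
)

print(value_by_position)
```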